ABSTRACT


Naval Aviation aircraft mishaps continue to be of great 
concern due to the high cost of loss of life and aircraft.
The goal of this thesis is to develop a predictive statistical 
model that accurately forecasts Marine Corps AV-8B Harrier 
aircraft mishaps based on existing monthly maintenance 
reports. Monthly maintenance reports provide numerous 
independent variables based on personnel levels and 
maintenance hours that could possibly be used to forecast 
aircraft mishaps. These variables were graphically analyzed 
to determine any relationships that could be exploited in 
developing the model. Higher order relationships were 
investigated by the method of principal components and 
logistic regression. After a thorough analysis, there appears 
to be no combination of variables in this particular data that 
could be used to forecast aircraft mishaps. The overall 
result of the thesis is that there is no relationship between 
monthly maintenance reports and aircraft mishaps that can be
exploited to develop a predictive statistical model.
EXECUTIVE SUMMARY


Aircraft mishaps continue to be a major concern to the 
Marine Corps due to the high costs associated with the loss of 
life and aircraft. A predictive statistical model or 
quantitative formula that identifies, on the basis of prior
months’ maintenance reports, a squadron at risk of having a
mishap would greatly enhance the commanding officer’s ability
to prevent mishaps. This thesis attempts to develop a
predictive statistical model which identifies high risk
squadrons based on existing monthly maintenance reports. That
is, we want to identify a set of conditions in
previous months’ maintenance records which presage with high
probability a mishap in the next month. Every squadron is
required to submit monthly maintenance reports that detail the
type and amount of maintenance performed on each aircraft in
that month and report maintenance personnel levels within the
squadron. Many experienced people involved in Naval Aviation
believe that they should be able to use these monthly reports 
to identify a squadron at risk of having a mishap. 

The Marine Corps is looking for a predictive statistical 
model that includes all aircraft types, but because of 
possible different operating environments and procedures 
between aircraft types, this thesis focuses on one particular 
aircraft. If a powerful predictive statistical model is
developed for this particular aircraft, then there is hope 
that the analysis and the statistical model could be expanded 
to include all aircraft types. The scope of the thesis has 
been narrowed to developing a predictive statistical model for 
the Marine Corps AV-8B Harrier aircraft. 

The overall goal of the predictive statistical model is to 
identify high risk squadrons based on existing monthly 
maintenance and personnel reports, and not to determine the
cause of mishaps. The statistical model will not determine if
a squadron is doing the correct amount of maintenance or if
the squadron is adequately manned, but rather whether, given the
reported numbers, the squadron is at high risk of having a
mishap.

The predictive statistical model will be developed by 
determining in which of the variables, or combination of the 
variables, there is a significant difference in the previous 
months’ maintenance pattern of a mishap and a non-mishap
squadron. These variables can then be used with various
statistical prediction and classification methods to attempt
to forecast high risk squadrons.

A graphical analysis indicated that there were no one- or
two-dimensional relationships that could be used to classify
a mishap squadron. Furthermore, the techniques of
principal components and logistic regression did not produce 
any higher order relationships that could be used to classify 
a mishap squadron. 

Based on this particular analyzed data there apparently is 
no relationship between existing monthly maintenance reports 
and aircraft mishaps. This may indicate that there is no 
relationship between the level of maintenance and mishaps, but 
the results also might indicate that a monthly generated 
report may not be useful in predicting an aircraft mishap.
The fact that the data is reported once a month, at the end of
the month, could conceal any useful subtle changes or
indications of a high risk squadron that occur during the
month.

Two alternative recommendations are evident. The first 
alternative is to accept that there may be no exploitable
relationship between monthly maintenance reports and aircraft 
mishaps and focus elsewhere to determine a predictive 
statistical model that forecasts aircraft mishaps. The second 
alternative recommendation is that further analysis be done,
possibly attempting to use daily maintenance reports rather than
monthly maintenance reports to determine a predictive
statistical model that forecasts aircraft mishaps.





I. INTRODUCTION 


A. PROBLEM STATEMENT 


Aircraft mishaps continue to be a major concern to the 
Marine Corps due to the high costs associated with the loss of 
life and aircraft. A predictive statistical model or
quantitative formula that identifies, on the basis of prior
months’ maintenance reports, a squadron at risk of having a
mishap would greatly enhance the commanding officer’s ability
to prevent mishaps. This thesis attempts to develop a
predictive statistical model which identifies high risk
squadrons based on existing monthly maintenance reports. That
is, we want to identify a set of conditions in
previous months’ maintenance records which presage with high
probability a mishap in the next month. Every squadron is
required to submit monthly maintenance reports that detail the
type and amount of maintenance performed on each aircraft in
that month and report maintenance personnel levels within the
squadron. Many experienced people involved in Naval Aviation
believe that they should be able to use these monthly reports 
to identify a squadron at risk of having a mishap. 

The following is a problem statement from a September 1993
Marine Corps aviation safety standdown:

1. Topic: Identify high risk aircraft units.

2. Discussion: Commanders must understand and use all
available statistical and subjective readiness indicators
to evaluate the risk level of their operational aircraft
units. Many readiness indicators are available for
Commanders to effectively evaluate and strengthen unit
readiness, but may not be consistently used. Commander
Marine Forces Pacific (MARFORPAC) recommends the Naval
Safety Center develop a quantitative formula that assigns
risk values to leading indicators which can be used to
identify high risk squadrons and forecast and manage
risk.

3. Action: Safety Division, using the resources
available at the Naval Postgraduate School and Naval
Safety Center, develop a quantitative formula which
assigns risk values to squadron aircraft utilization,
manning rates, mission capable rates, Status of Resources
and Training System (SORTS) data, and operations tempo, to
identify high risk squadrons. [Ref. 1]

The Marine Corps is looking for a predictive statistical 
model that includes all aircraft types, but because of 
possible different operating environments and procedures 
between aircraft types, this thesis focuses on one particular 
aircraft. If a powerful predictive statistical model is
developed for this particular aircraft, then there is hope 
that the analysis and the statistical model could be expanded 
to include all aircraft types. The scope of the thesis has 
been narrowed to developing a predictive statistical model for 
the Marine Corps AV-8B Harrier aircraft. 

There are obviously thousands of influences on aircraft 
mishaps but this thesis focuses on just existing monthly 
maintenance reports. It is conjectured that probably the
greatest influence on aircraft mishaps is that of the
commanding officer’s attitude concerning safety. However, this
is impossible to quantify and is not included in this study.
The operations tempo of a squadron may also greatly influence 
mishaps but is difficult to quantify, even as a categorical 
variable, and an acceptable operations tempo variable was not 
found to include in this thesis. For the preceding reasons, 
any model developed may not be a powerful model in forecasting 
mishaps, but could be used as a tool for commanding officers 
to help identify a squadron at risk of having a mishap. 

The overall goal of the predictive statistical model is to 
identify high risk squadrons based on existing monthly 
maintenance and personnel reports, and not to determine the 
cause of mishaps. The statistical model will not determine if
a squadron is doing the correct amount of maintenance or if
the squadron is adequately manned, but rather whether, given the
reported numbers, the squadron is at high risk of having a
mishap.

The predictive statistical model will be developed by
first determining in which of the variables, or combinations
of the variables, there is a significant difference in the
previous months’ maintenance pattern of a mishap and a
non-mishap squadron. These variables can then be used with
various statistical prediction and classification methods to
attempt to forecast high risk squadrons.


B. HISTORICAL BACKGROUND 


A Defense Technology Information Center search did not 
produce any related references on the topic of predicting 
aircraft mishaps based on monthly maintenance reports. A 
report titled "Marine Corps Aviation Mishap Rate Assessment 
Study" dated February 1992 includes some analysis of a similar 
problem [Ref. 2]. The study attempted to explain why the
Marine Corps 1990 mishap rate was alarmingly high. 

One section of the study tested the hypothesis that there 
exists a high correlation between increases in Direct 
Maintenance Man Hours per flight hour and the increase in 
mishap rate for 1990. For the test, data on Not Mission
Capable Supply, cannibalization, aircraft utilization, and
mishap rates were presented to the Naval Safety Center, 
Statistics and Mathematics Department for analysis. The study 
team was not able to demonstrate a correlation between 
aircraft utilization rates and support resources as 
independent variables and mishap rate as the dependent 
variable. The study team concluded: 


It is still intuitively appealing that there is a 
relationship and experts in the field, the operators 
and senior officers, firmly believe that the 
relationship is valid. [Ref. 2]


The study suggested two different alternatives. The first 
alternative is to concede that the relationship between 
utilization, resources and mishaps is undefined, and perhaps 
unimportant. The second alternative is to continue research
to define the relationship between aircraft utilization,
support resources like direct maintenance man hours per flight
hour and parts, and the mishap rate.

This thesis pursues the latter alternative. The
previous study included all Marine Corps aircraft combined and
focused on the relationship with mishap rate. This thesis is
more narrowly defined in that it focuses on one particular aircraft
and attempts to predict mishaps, rather than mishap rates.


C. TECHNICAL BACKGROUND 


The goal of any statistical model developed would be to 
accurately classify a squadron as a mishap or non-mishap 
squadron in the next month based on the current month’s
maintenance reports. The monthly maintenance report data
consists of numerous maintenance variables that are believed
to possibly influence mishaps. Ideally, a function can be
developed which uses these predictor variables to classify a
squadron as a mishap squadron. Therefore, a discriminant
function is needed that projects some combination of the
predictor variables to a decision space that classifies the
squadron as a mishap squadron or not. An example is the
following linear additive model:


$$D^k = f(\mathbf{x}) = a_1 x_1 + a_2 x_2 + \cdots + a_n x_n \tag{1}$$

where, $D^k$ = decision space (in k-space)
       $x_i$ = ith independent predictor variable
       $a_i$ = ith coefficient
       $i$ = 1, 2, ..., n


In other words, if given the function f(x) and a new set
of predictor variables x, the model would either classify a
squadron as an element of the acceptance region of the mishap
decision space, $D^k$, or not. A graphical explanation is shown
in Figure 1. The idea is to develop a function that maps the
n-space independent predictor variables to an outcome, or
decision space, that is partitioned into an accept and reject
region so as to determine if a mishap may occur.


[Figure 1 diagram: the product space (n-space) is mapped to the
outcome space, which is partitioned into a no-mishap region and
a mishap region.]





Figure 1. Mapping n-space to the outcome space. 
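
The mapping from the n-space of predictor variables to a partitioned outcome space can be illustrated with a minimal sketch. The coefficient values, the predictor values, the cutoff, and the convention that scores at or above the cutoff fall in the mishap region are all assumptions made for illustration; none of them come from the thesis data.

```python
from typing import Sequence


def linear_score(coefficients: Sequence[float], predictors: Sequence[float]) -> float:
    """Return a_1*x_1 + ... + a_n*x_n for one squadron-month."""
    if len(coefficients) != len(predictors):
        raise ValueError("need one coefficient per predictor variable")
    return sum(a * x for a, x in zip(coefficients, predictors))


def classify(score: float, cutoff: float) -> str:
    """Assumed convention: scores at or above the cutoff lie in the mishap region."""
    return "mishap" if score >= cutoff else "non-mishap"


# Hypothetical coefficients and predictor values, for illustration only.
coefficients = [0.5, -1.0, 2.0]
predictors = [4.0, 1.0, 0.5]

score = linear_score(coefficients, predictors)  # 0.5*4.0 - 1.0*1.0 + 2.0*0.5
label = classify(score, cutoff=1.5)
print(score, label)
```

In practice the coefficients and the cutoff would have to be estimated from the data, which is the problem the remainder of this section discusses.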


Identifying the function capable of this classification is 
not the only problem. Any statistical model developed from 
this function must be accurate in its forecast so that the 
model will be useful. But the statistical model also needs to 


minimize the probability of making errors. 


The two types of errors that are of concern are type I and
type II errors. A type I error is defined as rejecting that
the outcome is from the event population, when it actually is
from the event population. In this statistical model a type
I error is when a squadron is classified as a non-mishap
squadron when it is actually a mishap squadron. The
probability of a type I error is given by

$$\alpha = \Pr(\text{predict non-mishap} \mid \text{actually a mishap}) \tag{2}$$

A type II error is defined as accepting that the outcome
is from the event population when it actually is not from the
event population. In this statistical model a type II error
is when a squadron is classified as a mishap squadron when it
is actually not a mishap squadron. The probability of a type
II error is given by

$$\beta = \Pr(\text{predict mishap} \mid \text{actually no mishap}) \tag{3}$$


Obviously the type I error is the more serious of the two 
errors in this statistical model since a mishap occurs that 
was not predicted. But a high probability of a type II error, 
although no mishap occurred, can render the model useless. If 
the probability of a type II error is high, it means that the 
model is predicting an unacceptable number of squadrons as
mishap squadrons when they are non-mishap squadrons. 

Any model developed needs to minimize the probabilities of 
the type I and type II errors as much as possible, while still 
providing accurate predictions. The two types of errors are 
interrelated in that if one type of error is minimized it is
usually at the expense of the other. Generally, if the
probability of a type I error is minimized, while ignoring the
probability of a type II error, the probability of making a
type I error may be satisfactory but the probability of making
a type II error will be unsatisfactorily high. In this
statistical model this may result in an acceptable level of
type I errors, failing to predict a mishap when a mishap
actually occurs, but an unacceptable level of type II errors,
predicting a mishap when a mishap did not occur. Obviously
the type I error would be the lowest if all squadrons were
predicted as mishap squadrons, because there would be no type 
I errors. But the type II errors would be maximized, since 
most of the squadrons would have a false alarm, rendering the 
model useless. 
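
The trade-off between the two error types can be checked numerically. The sketch below estimates both error probabilities from a list of actual and predicted outcomes, using the definitions given above. The function name `error_rates` and the eight outcomes are hypothetical and are not thesis data.

```python
def error_rates(actual, predicted):
    """Return (alpha, beta) using the definitions in the text.

    alpha = Pr(predict non-mishap | actually a mishap)
    beta  = Pr(predict mishap | actually no mishap)
    Both inputs are sequences of 1 (mishap) and 0 (non-mishap).
    """
    mishaps = [p for a, p in zip(actual, predicted) if a == 1]
    non_mishaps = [p for a, p in zip(actual, predicted) if a == 0]
    if not mishaps or not non_mishaps:
        raise ValueError("need at least one mishap and one non-mishap observation")
    alpha = sum(1 for p in mishaps if p == 0) / len(mishaps)
    beta = sum(1 for p in non_mishaps if p == 1) / len(non_mishaps)
    return alpha, beta


# Made-up outcomes for eight squadron-months (not thesis data).
actual = [1, 0, 0, 0, 1, 0, 0, 0]
some_model = [1, 0, 1, 0, 0, 0, 0, 0]
always_mishap = [1] * len(actual)

print(error_rates(actual, some_model))     # one missed mishap, one false alarm
print(error_rates(actual, always_mishap))  # no type I errors, every non-mishap is a false alarm
```

Predicting a mishap for every squadron drives the type I error to zero and the type II error to one, which is the degenerate case described in the preceding paragraph.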

Dividing the data into mishap and non-mishap observations 
creates two separate populations with numerous independent 
predictor variables. Marginal analysis of each of these 
univariate independent predictor variables from the separate 
populations can determine if there exists a significant 
difference between a mishap and non-mishap squadron with 
respect to that particular variable alone. For example, the
classification may be a function of just one variable, i.e.,

$$D^k = f(x) = a x \tag{4}$$


To determine if there is a significant difference in the
distribution of a variable among two populations it is assumed
that the two populations have similar distributions with
possibly different parameters. To graphically show
differences, the density traces of the variables from each
population are superimposed on the same density plot. Any
significant differences can be determined by comparing the two
traces.

For example, this technique could be used if trying to
determine significant differences in a predictor variable from
separate populations, non event and event observations.
Figure 2 shows two superimposed density traces of a variable
from two separate populations that show the obvious
significant difference of the event observations variable
being larger than the non event observation variable. In this
example the plotted variable could possibly be used to
classify an observation as an event or non event by setting
the rejection region at w, thereby accepting that a new set
of values comes from the non event population if the outcome is
less than w. As can be seen in this example, a model using
the example variable would be very powerful, with a low
probability of both types of error. But if the density traces
shift so that they overlap more, then using the
same w will result in the exact same type I error while the
type II error will increase dramatically.

Figure 2. Density trace comparisons of one
dimensional data with a significant difference
in population density.
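
A density trace of the kind compared in Figure 2 can be computed as a kernel density estimate. The sketch below is one possible construction and not necessarily the one used to produce the figures: the Gaussian kernel, the bandwidth of 0.5, and the sample values are all assumptions made for illustration.

```python
import math


def density_trace(sample, grid, bandwidth):
    """Gaussian-kernel density estimate of `sample`, evaluated at each grid point."""
    if bandwidth <= 0 or not sample:
        raise ValueError("need a non-empty sample and a positive bandwidth")
    norm = len(sample) * bandwidth * math.sqrt(2.0 * math.pi)
    return [
        sum(math.exp(-0.5 * ((g - s) / bandwidth) ** 2) for s in sample) / norm
        for g in grid
    ]


# Made-up values of one predictor variable (not thesis data).
non_event = [1.0, 1.5, 2.0, 2.2, 2.8, 3.0]
event = [4.0, 4.5, 5.1, 5.5]

step = 0.05
grid = [-5.0 + step * i for i in range(301)]  # evaluation points from -5.0 to 10.0

non_event_trace = density_trace(non_event, grid, bandwidth=0.5)
event_trace = density_trace(event, grid, bandwidth=0.5)

# Location of the highest point of each trace.
non_event_peak = grid[non_event_trace.index(max(non_event_trace))]
event_peak = grid[event_trace.index(max(event_trace))]
print(non_event_peak, event_peak)
```

Plotting both traces against `grid` on the same axes gives the superimposed comparison described in the text; well separated peaks correspond to the situation in which a single cutoff w classifies with low error.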


On the other hand, Figure 3 shows two superimposed density 
traces of a variable from two separate populations that show 
no obvious significant differences between non event and event 
observations. In this example, there is no way that this
variable could be used to classify a squadron as a mishap or
non-mishap squadron because there is no rejection region that
can be identified that could be used to distinguish between
the two populations with a high degree of accuracy.

[Figure 3 legend: event observations; non event observations.
Vertical axis: density.]

Figure 3. Density trace comparisons of data with
no significant difference in population density.

The above discussion uses just an analysis of the 
univariate independent predictor variables to attempt to 
classify an observation as an event or a non event. It is
also possible that combinations of independent variables may
produce the function that classifies the dependent variable as
an event or non event. Producing a coded scatter plot of each
independent variable versus each other independent variable
may produce a clustering of observations that could be used to
classify the dependent variable as an event observation. A
coded scatter plot provides a three dimensional display by
having the two independent predictor variables plotted against
each other and having separate symbols showing event and non
event observations. This provides an easy way to determine if
any observations are clustering, i.e., if most of the event
observations are grouped together it shows that the
combination of variables may produce a model that can classify
the observation as an event or non event.

Figure 4 shows an example of two independent variables, x 
and y, that are being used to attempt to discriminate between 
two populations on the basis of x and y. A plot of x and y 
with the two separate populations coded could show any 
clustering of the dependent variable. As can be seen in 
Figure 4, there is no rejection region that can be used to
separate the two populations and classify an event or non 
event with a high degree of accuracy. 

Figure 5 shows that when the observations are from the 
event population all of the observations are in a tight and 
separated cluster. This shows the possibility of using x and 
y to classify an observation as an event or non event. As can 
be seen in Figure 5, by setting the rejection region at the 
indicated line, the event and non event observations can be 
accurately classified. For example, the indicated rejection 
line in Figure 5 is a function of x and y that maps to a point
in a two-space decision space

$$D^2 = f(x, y) = a_1 x + a_2 y \tag{5}$$

where $a_1$ and $a_2$ are the coefficients of x and y.

Figure 4. Coded scatter plots showing no breakout
or clustering of event observations.


So, given any x and y, the function will map the
observation onto the decision space and if the point lies
below the acceptance region dividing line then that
observation is classified as an event. Whereas, if the point
lies above the acceptance region dividing line then that
observation is classified as a non event.

Higher order combinations of the variables can also
provide the predictive statistical model. Instead of
graphical analysis, the higher order functions are
investigated by multivariate techniques such as discriminant
analysis, logistic regression, and cluster analysis.

Figure 5. Coded scatter plot showing a significant
breakout or clustering of event observations.


II. DATA OVERVIEW


A. DATA DESCRIPTION 
1. Mishap Data 


The aircraft mishap data was provided by Headquarters 
Marine Corps Aviation Safety Division and includes data on 
nine AV-8B Harrier squadrons over the time period of January 
1990 to November 1993. The data consisted of the date, 
severity, squadron, and brief description of all Flight 
Mishaps involving Harriers in this period. A naval aircraft
Flight Mishap is defined as an unplanned event directly
involving naval aircraft in which there was $10,000 or greater
aircraft damage, or loss of aircraft, and intent for flight
existed at the time of the mishap. Table I shows the
definitions of the mishap severity classes based on personal
injury and property damage. Any occurrence in which total
cost of property damage is less than $10,000 and there are no
defined injuries is not considered a reportable naval
aircraft mishap.

The description of the mishap is an excerpt from the
Mishap Investigation Report that provides a short narration of
the causal factors of the mishap. The causal factors can be
divided into three basic categories. The first is mishaps
caused by human factors, i.e., human error by the aircrew,
supervisory personnel, maintenance personnel, or facilities.
The second is mishaps caused by a material failure, i.e., a
component fails causing the mishap. The last is mishaps caused by
an aircraft hitting a bird.

All three severity classes of mishaps (A, B, and C) were 
combined to form a dependent variable that indicates if a 
squadron had a mishap in a month or did not have a mishap in 
that month. All causal factors were combined except for the
birdstrike mishaps. Since there is no credible way to predict
birdstrike mishaps, they were not considered a mishap month in
the analysis. All of the remaining mishap observations were
included in the belief that the mishap observations and
independent predictor variables could be used to develop a
statistical model that can discriminate between a mishap and
non-mishap squadron based on monthly maintenance reports. In
three separate cases a squadron that had two mishaps in the
same month was included as a single observation of a mishap
in that month.


Mishap Severity   Description

Class A           A mishap in which the total cost of
                  property damage is $1,000,000 or
                  greater; or a naval aircraft is
                  destroyed or missing; or any fatality or
                  permanent total disability occurs with
                  direct involvement of naval aircraft.

Class B           A mishap in which the total cost of
                  property damage is $200,000 or more,
                  but less than $1,000,000; or a
                  permanent partial disability, or
                  hospitalization of five or more
                  personnel occurs.

Class C           A mishap in which the total cost of
                  property damage is $10,000 or more, but
                  less than $200,000; or injury results
                  in one or more lost workdays.

Table I. Classifications of Naval Aircraft Mishaps.
From Ref. [3].


2. Maintenance Data 


The maintenance data was provided by the Naval Safety 
Center through the Naval Aviation Logistic Data Analysis 
system. This data consisted of the Equipment Condition 
Analysis report and the maintenance man hours per flight hour 


for the nine squadrons. The Equipment Condition Analysis 
report data consisted of the reported Aviation Maintenance and
Material Management (3M) system data for each squadron in each
month. The amount of maintenance hours is divided into
separate categories based on the information on the
Maintenance Action Form. The Maintenance Action Form is the
paperwork that describes particular maintenance to be done and
assigns the maintenance to the appropriate work center [Ref.
4]. Included in this data for each squadron is:


1. Date by month from January 1990 to November 1993. 


2. Average number reporting inventory: average number
of aircraft assigned in each month.


3. Flight hours: total flight hours in each month.


4. Number sorties: total number of flights in each
month.


5. Number landings: total number of landings in each
month.


6. Hours Equipment in Service: total number of hours
that the aircraft were available for use in each month.


7. Hours Not Mission Capable Maintenance-Scheduled: 
total number of hours that aircraft were not capable of 
performing any of their missions due to scheduled 
maintenance requirements in each month. Scheduled 
maintenance is the periodic prescribed inspection/ 
servicing of equipment, done on a calendar or hours of 
operation basis. An aircraft is considered Not Mission
Capable Maintenance-Scheduled only if panels and 
equipment removed to conduct area inspections cannot be 
replaced within two hours. 


8. Hours Not Mission Capable Maintenance-Unscheduled: 
total number of hours that aircraft were not capable of 
performing any of their missions due to unscheduled 
maintenance requirements in each month. All not mission 
capable maintenance hours that are not Not Mission 
Capable Maintenance-Scheduled are classified as Not 
Mission Capable Maintenance-Unscheduled. Unscheduled 
maintenance is performed when corrective maintenance is
required.


9. Hours Not Mission Capable Supply: total number of
hours that aircraft were not capable of performing any
of their missions because maintenance required to clear
the discrepancy cannot continue due to a supply shortage.


10. Hours Partially Mission Capable Maintenance- 
Unscheduled: total number of hours that the aircraft 
were capable of performing at least one, but not all of
their missions due to unscheduled maintenance 
requirements in each month. 


11. Hours Full Mission Capable Maintenance-Unscheduled: 
total number of hours that aircraft were capable of 
performing all of their missions but are not at optimum 
performance due to unscheduled maintenance requirements 
in each month. 


12. Maintenance Man Hour per Flight Hour: average 
number of hours of maintenance done per flight hour in 
each month. Derived by dividing total maintenance hours 
by total hours flown.


The maintenance data was reduced somewhat. The number of 
landings was obviously highly correlated with the number of 
sorties, therefore the number of landings was omitted since 
the number of sorties provides essentially the same 
information. The hours Equipment in Service was perfectly
correlated with the average number of aircraft assigned since
the total hours equipment in service is the average number of
aircraft multiplied by the total number of hours in the month.
Therefore the hours equipment in service was not included in
the analysis. If a squadron had numerous missing data in a
particular month, that month was deleted from the data. And,
if the amount of flight hours in a month was less than 100, 
then that month was deleted since that month was obviously not 
a normal operating month and may skew any results of the 


analysis. 
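
The reductions described above can be summarized as a short filtering routine. This is a sketch under stated assumptions rather than the procedure actually used: the field names, the squadron labels, and the `max_missing` limit are hypothetical (the text says only that months with "numerous" missing data were deleted), and a month with no reported flight hours is assumed to be deleted as well.

```python
def reduce_maintenance_data(records, min_flight_hours=100, max_missing=2):
    """Apply the reductions described above to a list of squadron-month dicts.

    Field names and `max_missing` are assumptions made for this sketch.
    """
    dropped_fields = {"landings", "hours_in_service"}  # redundant variables
    reduced = []
    for record in records:
        missing = sum(1 for value in record.values() if value is None)
        if missing > max_missing:
            continue  # too many missing values in this month
        flight_hours = record.get("flight_hours")
        if flight_hours is None or flight_hours < min_flight_hours:
            continue  # not a normal operating month
        reduced.append({k: v for k, v in record.items() if k not in dropped_fields})
    return reduced


# Made-up records (not thesis data); 20 aircraft x 744 hours = 14880 hours in service.
records = [
    {"squadron": "A", "month": "1990-01", "flight_hours": 350.0,
     "sorties": 300, "landings": 310, "hours_in_service": 14880.0},
    {"squadron": "A", "month": "1990-02", "flight_hours": 60.0,
     "sorties": 55, "landings": 57, "hours_in_service": 13440.0},
    {"squadron": "B", "month": "1990-01", "flight_hours": None,
     "sorties": None, "landings": None, "hours_in_service": 14880.0},
]

reduced = reduce_maintenance_data(records)
print(reduced)
```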
3. Personnel Data


The personnel data was provided by Headquarters Marine
Corps and consisted of the number of each maintenance related
Military Occupational Specialty in each squadron in each
month. Eight squadrons were included in this data. The data 
provided was the number of each specialty, and was not 
compared with the squadron Table of Organization to determine 
if a squadron was manned at a level consistent with the Table 
of Organization. The data consisted of quarterly data from 
January 1990 to December 1992 and monthly data from February 
1993 to November 1993. The month of January 1993 was missing 
from the data. The following is the brief description of the 
provided Military Occupational Specialties: 

1. Aircraft Mechanic: responsible for engine repair,
daily inspection, and launching and recovering aircraft.


2. Aircraft Maintenance Chief: senior enlisted person in
maintenance department. Usually only a couple in entire
squadron, one as maintenance chief, responsible for
overseeing the department, and one as the maintenance
control chief, responsible for assigning maintenance on a
particular aircraft to the responsible work center.


3. Aircraft Maintenance Administrative Clerk and Aircraft 
Maintenance Data Analysis Technician: responsible for 
tracking maintenance and preparing required reports. 


4. Aircraft Maintenance Hydraulics and Pneumatics 
Mechanic: responsible for maintenance of the hydraulic 
systems and aircraft body maintenance. 


5. Flight Equipment Marine: responsible for maintenance 
of aircrew personal flight equipment. 


6. Aircraft Maintenance Ground Support Equipment
Mechanic: responsible for maintenance on ground support 
equipment used in the maintenance of the aircraft. 


7. Aircraft Safety Equipment Mechanic: responsible for 
maintenance of ejection seats and environmental systems. 


8. Aircraft Communications/Navigation System Technician: 
responsible for maintenance of communications/navigation 
and related systems. 


9. Aircraft Electrical System Technician: responsible 
for maintenance of electrical systems. 


10. Avionics Maintenance Chief: senior enlisted person in
avionics division.


11. Aircraft Ordnance Technician: responsible for
ordnance delivery systems and loading of ordnance.


12. Aviation Ordnance Chief: senior enlisted person in
ordnance division.


All twelve specialties were included in the analysis,
although it is doubtful that some of them would affect
aircraft mishaps. The aircraft maintenance chief, avionics
chief, and ordnance chief specialties probably will not be
significantly different between mishap and non-mishap
squadrons since all squadrons have just one or two of these
specialties and are almost always manned. The data analysis
section, the flight equipment section, ground support section,
and safety equipment section probably will not be
significantly different between mishap and non-mishap
squadrons since maintenance performed by these sections is
highly specialized and is rarely, if ever, considered a causal
factor in an aircraft mishap.


B. DATA REDUCTION 
1. One Month Lag 


All of the above data are contained in reports that are 
generated at the end of the month being reported upon. Hence 
these data are not useful in trying to predict a mishap in that 
month, since the month is already past. Also, a squadron that 
has a mishap will sometimes drastically change its operating 
procedures, obviously affecting the maintenance reports for 
that month. For these reasons, the maintenance figures that a 
squadron reported for each month were used as independent 
variables to attempt to predict a mishap in that squadron in 
the next month. In effect, the maintenance variables enter the 
analysis with a one-month lag, as predictors of a mishap in 
the following month. 
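
The one-month lag can be sketched in a few lines of Python. This is only an illustration of the pairing described above, not the processing actually used in the thesis: the record layout (keys `squadron`, `month`, `features`, `mishap`), the function name `build_lagged_observations`, and the sample values are all assumptions made for the example.

```python
def build_lagged_observations(records):
    """Pair each month's maintenance figures with the NEXT month's mishap flag.

    `records` is a list of dicts with hypothetical keys:
    'squadron', 'month' (integer index), 'features' (dict), 'mishap' (0 or 1).
    """
    by_key = {(r["squadron"], r["month"]): r for r in records}
    lagged = []
    for (squadron, month), rec in sorted(by_key.items()):
        nxt = by_key.get((squadron, month + 1))
        if nxt is None:
            # No following month on record, so there is no outcome to predict.
            continue
        lagged.append({
            "squadron": squadron,
            "predictor_month": month,
            "target_month": month + 1,
            "features": rec["features"],
            "mishap_next_month": nxt["mishap"],
        })
    return lagged


# Made-up records for one squadron over three months (illustration only).
example = [
    {"squadron": "A", "month": 1, "features": {"mmh_per_fh": 18.0}, "mishap": 0},
    {"squadron": "A", "month": 2, "features": {"mmh_per_fh": 22.5}, "mishap": 1},
    {"squadron": "A", "month": 3, "features": {"mmh_per_fh": 15.0}, "mishap": 0},
]
lagged_example = build_lagged_observations(example)
```

Under this pairing, the last month on record for each squadron has no following month and therefore drops out as a predictor month.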
2. Final Data 


The original data set contained approximately 432 
observations (nine squadrons x 48 months of data), of which 54 
were mishap observations and 378 were non-mishap observations. 
Each observation consisted of a month with a binary dependent 
variable indicating whether or not a mishap occurred, and 23 
possible independent predictor variables. After the above 
reductions in the data, the final data set used in the 
analysis contained 368 observations, of which 44 were mishap 
observations and 324 were non-mishap observations. Each 
observation includes the binary dependent variable and 21 
possible independent predictor variables. 
3. Model Formulation 


The final data set and model of the problem can be 
considered similar to Anderson’s Iris Data, made famous by 
Fisher [Ref. 5]. In that data set there were measurements 
from three varieties of flowers, and the problem was to develop 
a model and a procedure that would classify a particular 
flower as one of the three varieties. The data set consisted 
of four measurements on each of 150 flowers; the 
sample contained 50 flowers of each variety. So 
these data may be regarded as 150 observations 
in four-dimensional space. The goal of a model is to develop 
a function that maps the observations from four-dimensional 
space to some outcome space that will enable the 
classification of the flower into a particular category. In 
this example, plotting petal length versus petal width and 
coding each observation by variety shows an obvious 
clustering by type of flower that can be used to classify 
each flower. 




The final mishap data set is somewhat similar to the above 
example, but obviously more complex. The final data set was 
a set of 21 measurements on each of 368 separate monthly 
observations. The 21 measurements include all of the 
personnel and maintenance figures discussed previously for 
that particular month. The sample contained 324 non-mishap 
monthly observations and 44 mishap monthly observations. The 
data can then be regarded as 368 observations in 
twenty-one dimensional space. 
III. DATA ANALYSIS 


A. APPROACH TO ANALYSIS 


The approach to analysis was first to perform a 
one-dimensional graphical marginal analysis of each independent 
predictor variable. For each independent predictor variable, 
density traces from the two populations, mishap and 
non-mishap, were superimposed to reveal any significant 
differences between the two populations. As discussed earlier, 
if any of the independent predictor variables indicates a 
significant difference between the mishap and non-mishap 
populations, that variable, or those variables, could be used 
to classify an observation as a mishap or non-mishap 
squadron. 

Following the one-dimensional analysis, a two-dimensional 
graphical analysis of the independent predictor variables was 
performed to identify any pair of predictor variables that 
could be used to classify a squadron as a mishap or non-mishap 
squadron. All pairs of the possible independent predictor 
variables were plotted in coded scatter plots. If any of the 
coded scatter plots showed a clustering of mishap or 
non-mishap observations, then that pair of independent 
variables could possibly be used to discriminate between a 
mishap and a non-mishap squadron. 

Following the one- and two-dimensional graphical analyses, 
the independent predictor variables were analyzed in higher 
dimensions with the multivariate techniques of principal 
components and logistic regression in an attempt to develop 
the predictive statistical model. These techniques can reveal 
higher order relationships that may be used to classify a 
squadron as a mishap or non-mishap squadron. 




All graphical output was produced using IBM’s A Graphical 
Statistical System (AGSS) [Ref. 6] on a 486DX-50 personal 
computer. 


B. PERSONNEL DATA ANALYSIS 


The twelve military occupational specialties considered 
were plotted on density trace plots to determine whether there 
was a first order significant difference in the distribution of 
manning levels in each specialty between mishap squadrons and 
non-mishap squadrons. All of the 
plots reveal that there is no discernable area (marginal) 
effect between a mishap squadron and a non-mishap squadron. 
All of the density traces of the personnel data are reproduced 
in Appendix A. A representative plot for the Aircraft Mechanic 
specialty is shown in Figure 6. As can be seen, there is not 
a significant difference in the density traces of aircraft 
mechanics assigned to mishap and non-mishap squadrons. 
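
A density trace is a smoothed histogram, and the superimposed comparison can be sketched as follows. The source does not state which kernel or bandwidth AGSS uses, so this sketch substitutes a Gaussian kernel with a hand-picked bandwidth as an assumption; the function name `density_trace` and the manning counts are made up for illustration.

```python
import math


def density_trace(sample, grid, bandwidth):
    """Gaussian kernel density estimate of `sample` at each point of `grid`.

    The Gaussian kernel and the bandwidth are assumptions of this sketch.
    """
    n = len(sample)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in sample)
        for x in grid
    ]


# Made-up manning counts for one specialty (illustration only).
mishap_counts = [30, 32, 35, 31, 34]
non_mishap_counts = [29, 33, 34, 30, 36, 32, 31]

step = 0.5
grid = [20 + step * k for k in range(61)]  # 20.0, 20.5, ..., 50.0
trace_mishap = density_trace(mishap_counts, grid, bandwidth=2.0)
trace_non_mishap = density_trace(non_mishap_counts, grid, bandwidth=2.0)

# Largest vertical gap between the two traces. A small gap means the two
# populations overlap and the variable does not separate them.
max_gap = max(abs(a - b) for a, b in zip(trace_mishap, trace_non_mishap))
```

Plotting `trace_mishap` and `trace_non_mishap` against `grid` on the same axes gives the kind of superimposed picture described above; `max_gap` is just one crude numerical summary of how far apart the two traces are.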

The manning level results are undoubtedly highly 
influenced by the fact that most of the personnel data were 
reported as quarterly figures. Since the same number of 
personnel was reported for each month of a quarter, 
month-to-month changes between mishap and non-mishap squadrons 
were not distinguishable. 

It bears repeating that the personnel data were compared by 
the total number of individuals in each specialty. This 
number was not compared to the Table of Organization, since the 
goal of the thesis was to distinguish between a mishap and a 
non-mishap squadron, and not to determine if a squadron was 
manned at Table of Organization level. This analysis also had 
no way of assessing the experience level of the individuals 
assigned to different squadrons. It was assumed that the 
experience level would be similar among squadrons, which may 
or may not be true. Obviously, the experience level among the 
maintainers could influence the chances of the squadron having 
a mishap. 

Figure 6. Density traces of Aircraft Mechanics 
assigned to each squadron. 

Based on the above one-dimensional analysis, the personnel 
data were not considered significant and therefore were not 
included in any further analysis. 


C. MAINTENANCE DATA ANALYSIS 


The marginal analysis of the ten possible maintenance 
predictor variables was done by plotting density traces of 
each variable to determine whether there was a first order 
significant difference in the distribution of the variable 
between mishap squadrons and non-mishap squadrons. All of 
the density trace plots of the maintenance independent 
predictor variables are reproduced in Appendix B. None of the 
plots revealed any discernable area (marginal) effect in one 
dimension between a mishap and non-mishap squadron. A 
representative plot of Maintenance Man Hours per Flight Hour 
is shown in Figure 7. The figure clearly shows that there is 
not a significant difference between the maintenance man hours 
per flight hour per month in the mishap squadron population 
and the non-mishap squadron population. The majority of the 
observations fall between 10 and 25 maintenance man hours per 
flight hour, with no way of separating the mishap from the 
non-mishap observations. 

Figure 7. Density trace of Maintenance Man Hours 
per Flight Hour. 

The one-dimensional analysis of all maintenance 
independent predictor variables did not produce any 
significant differences that could be used to classify a 
squadron as a mishap or non-mishap squadron, so all of the 
independent maintenance predictor variables were retained and 
an analysis of two-dimensional relationships was performed. 

To determine any two-dimensional relationship, all 
possible pairs of the ten independent maintenance predictor 
variables were plotted in coded scatter plots. A coded 
scatter plot is a technique in which each independent variable 
is plotted against every other independent variable, with each 
point coded by its class, to reveal any second order 
interaction of variables that could 
be used in classifying a squadron as a mishap or non-mishap 
squadron. A coded scatter plot shows the relationship 
between the two predictor variables, as well as any possible 
relationship that predicts a mishap, i.e., separate clustering 
of observations that distinguishes between mishap and 
non-mishap squadrons. The coded scatter plots showed no 
discernable area of effect that could be used in 
discriminating between a 
mishap and non-mishap squadron. A representative plot is 
shown in Figure 8, with all the possible pairs of plots 
reproduced in Appendix C. The coded scatter plots show mishap 
and non-mishap months and also distinguish the training 
squadron from the regular squadrons. The training squadron 
is shown separately to determine whether the training 
environment is possibly significant in determining mishaps. 

In Figure 8 the total Flight Hours of a squadron are 
plotted against the Maintenance Man Hours per Flight Hour. It 
is obvious that the training squadron produces more flight 
hours each month and has slightly higher maintenance man 
hours per flight hour. But there is no discernable area of 
effect exclusive to a mishap or non-mishap squadron. Ideally, 
all the mishap observations would be clustered together, 
separated from a cluster of all the non-mishap observations. 

Figure 8. Coded scatter plot of Maintenance Man 
Hours per Flight Hour versus Total Flight Hours. 

The above two-dimensional analysis suggested several 
transformations of the original independent variables. 
As could be expected, the total number of flight 
hours and the total number of sorties a squadron flies in a 
particular month are highly correlated, and hence provide 
essentially the same information. Therefore the number of 
sorties was dropped as a predictor variable. 
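
The redundancy between two highly correlated variables can be checked with a correlation coefficient. The sketch below uses the Pearson coefficient as an assumption, since the source does not say how the correlation was judged; the function name `pearson_r` and the monthly figures are made up for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Made-up monthly totals for five squadron-months (illustration only).
flight_hours = [310.0, 420.5, 280.0, 515.0, 365.0]
sorties = [250, 335, 230, 410, 300]

r = pearson_r(flight_hours, sorties)
# When r is close to 1, the two variables carry nearly the same
# information and one of them can be dropped.
```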

Since the training squadron is always assigned more 
aircraft, and the other squadrons’ total assigned aircraft can 
vary significantly, the total flight hours may be skewed 
somewhat. Therefore the total flight hours flown in each 
month were divided by the total aircraft assigned that month 
to form a new independent predictor variable, 
average flight hours per aircraft assigned in each month. 
This new independent predictor variable is basically an 
indicator of the utilization rate of the aircraft in a 
squadron. 

Many of the different maintenance predictor variables were 
spread over a wide range because of a few unusually high or 
low reported maintenance months. These months could not be 
considered outliers, so all maintenance predictor variables 
were transformed by taking the logarithm of the variable, 
providing a more presentable plot, without changing any of the 
existing relationships. 
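
The two transformations described above can be sketched as follows. The function names and all figures are hypothetical, and the natural logarithm is an assumption, since the source does not state the base used.

```python
import math


def flight_hours_per_aircraft(total_flight_hours, aircraft_assigned):
    """Average flight hours per assigned aircraft for one squadron-month."""
    if aircraft_assigned <= 0:
        raise ValueError("aircraft_assigned must be positive")
    return total_flight_hours / aircraft_assigned


def log_transform(values):
    """Natural logarithm of each value; every value must be positive."""
    return [math.log(v) for v in values]


# Made-up figures: a large training squadron and a smaller regular squadron
# can have the same utilization rate despite different total flight hours.
training_rate = flight_hours_per_aircraft(600.0, 30)
regular_rate = flight_hours_per_aircraft(300.0, 15)

# Made-up maintenance man hours per flight hour, with one unusually high month.
mmh_per_fh = [12.0, 18.0, 25.0, 140.0]
log_mmh_per_fh = log_transform(mmh_per_fh)
```

Because the logarithm is strictly increasing, the ordering of the observations is unchanged; only the spacing is compressed, which pulls the unusually high months closer to the bulk of the data on a plot.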

As before, a one-dimensional marginal analysis was 
performed on the transformed independent predictor variables. 
A representative density trace of Flight Hours per Aircraft is 
shown in Figure 9, with the remaining density traces of the 
transformed independent predictor variables reproduced in 
Appendix D. This plot, like all the other plots, clearly 
shows that there is no discernable area of effect between 
flight hours per aircraft in the mishap squadron population 
and the non-mishap squadron population. 

Figure 9. Density trace of Flight Hours per 
Aircraft. 

A two-dimensional analysis was then performed on the 
transformed predictor variables using coded scatter plots to 
identify any significant pairs of predictor variables. The 
eight transformed independent maintenance predictor variables 
were plotted in coded scatter plots so that each independent 
variable was plotted against every other independent 
variable, to reveal any pair of variables that could be 
used in classifying a squadron as a mishap or non-mishap 
squadron. The plots showed no discernable area of effect that 
could be used in discriminating between a mishap and 
non-mishap squadron. A representative coded scatter plot of the 
logarithm of Maintenance Man Hours per Flight Hour versus 
Flight Hours per Aircraft is shown in Figure 10, with the 
remaining coded scatter plots of the transformed independent 
predictor variables reproduced in Appendix E. The plot shows 
mishap and non-mishap months and also distinguishes the 
training squadron from the regular squadrons. The training 
squadron is shown separately to determine whether the training 
environment is significant in determining mishaps. Included 
in each of these plots is a locally weighted regression 
scatter plot smoothing (LOWESS) function to help indicate any 
relationship between the two independent variables [Ref. 7]. 
Except for a few extreme months, the utilization rate and log 
of Maintenance Man Hours per Flight Hour of all the 
observations, both mishap and non-mishap, are tightly 
clustered in one group. But there is no discernable 
clustering of the mishap observations separated from the 
non-mishap observations. It is somewhat interesting to note 
that as the utilization rate goes up, the Maintenance Man 
Hours per Flight Hour decrease, probably because the 
aircraft are up and flying more and not breaking, or possibly 
because there is less time to perform maintenance. 

Figure 10. Coded scatter plot of the logarithm 
of Maintenance Man Hours per Flight Hour versus 
Flight Hours per Aircraft. 

Based on the above one- and two-dimensional analyses of the 
original and transformed predictor variables, there do not 
appear to be any discernable relationships that could be used 
in classifying a squadron at risk of having a mishap based 
upon the existing monthly maintenance reports. Since none of 
the independent variables was determined to be significant in 
the above graphical analysis, all of the transformed 
independent maintenance variables were retained as possible 
predictor variables for an analysis of higher order 
interactions. 
IV. FURTHER ANALYSIS 


A. PRINCIPAL COMPONENTS 


Since the initial graphical analysis did not reveal any 
discernable first or second order discriminant function, the 
method of principal components was used to determine whether 
any linear combination of variables exists that could be used 
to classify a high risk squadron. The principal components are 
the independent linear combinations of the existing variables 
that maximize the variances. 

The principal components method in effect rotates the 
coordinate axes of the data to a new coordinate system that 
has inherent statistical properties. This is a way of 
reducing the number of variables to be considered, by 
discarding linear combinations that have small variances and 
studying only those with large variances. The idea is to focus 
on the directions of largest variance in the variables to help 
discriminate between mishap and non-mishap squadrons [Ref. 8]. 

The data were divided into two separate data sets: a matrix 
M, containing all the maintenance independent predictor 
variables from the mishap observations, and a matrix N, 
containing all the maintenance independent predictor variables 
from the non-mishap observations. The non-mishap 
observations were used as the baseline, since the objective of 
the thesis was to discriminate between mishap and non-mishap 
observations. The principal components method was applied to 
the data of non-mishap observations to produce a matrix of 
principal component coefficients, P. The transpose of this 
matrix was then multiplied by both matrices M and N, 
producing matrices whose elements are the baseline component 
values of the mishap and non-mishap data, PM = M’ and PN = N’. 
The values of the original variables are thereby projected 
onto the baseline principal axes. To see whether these 
component values are useful for classifying squadrons as 
mishap and non-mishap squadrons, the distributions of the 
first principal component values are compared for significant 
differences. To compare them, the first principal component of 
each of the component value matrices was standardized using 
the mean and standard deviation of the non-mishap observations: 


$$ u'_{i1} = \frac{n'_{i1} - \bar{n}'_{1}}{s_{n'}}, \qquad v'_{i1} = \frac{m'_{i1} - \bar{n}'_{1}}{s_{n'}} \qquad (6) $$

where, 

$u'_{i1}$ is the standardized first principal component of 
the non-mishap predictor variables. 

$v'_{i1}$ is the standardized first principal component of 
the mishap predictor variables. 

$\bar{n}'_{1}$ and $s_{n'}$ are the average and standard deviation of 
the first principal component of the non-mishap 
predictor variables. 

$n'_{i1}$ and $m'_{i1}$ are the individual entries in the first 
column of the two principal component matrices. 

These standardized first principal components are then 
superimposed on a density trace plot. Any significant 
difference between the two densities in the plot would 
indicate a transformation of axes that could be exploited to 
classify the observations as mishap or non-mishap. 
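
The baseline projection and standardization can be sketched as below. This is an illustration under stated assumptions, not the computation used in the thesis: observations are stored as rows, only the first principal axis is computed, power iteration is swapped in for a full eigendecomposition, and all function names and data values are made up.

```python
import math


def covariance_matrix(rows):
    """Sample covariance matrix of observations stored as rows."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    centered = [[x - m for x, m in zip(row, means)] for row in rows]
    p = len(means)
    return [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
        for i in range(p)
    ]


def first_principal_axis(cov, iterations=500):
    """Leading eigenvector of `cov` by power iteration (unit length).

    Assumes the start vector is not orthogonal to the leading axis and
    that the largest eigenvalue is distinct.
    """
    p = len(cov)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def project(rows, axis):
    """Component value of each observation along `axis`."""
    return [sum(x * a for x, a in zip(row, axis)) for row in rows]


def standardize_against_baseline(baseline_scores, scores):
    """Standardize `scores` with the mean and sd of the baseline scores."""
    n = len(baseline_scores)
    mean = sum(baseline_scores) / n
    sd = math.sqrt(sum((s - mean) ** 2 for s in baseline_scores) / (n - 1))
    return [(s - mean) / sd for s in scores]


# Made-up data with two predictor variables (illustration only).
N = [[2.0, 1.9], [3.0, 3.2], [4.0, 3.9], [5.0, 5.1], [6.0, 5.9]]  # non-mishap
M = [[3.5, 3.4], [4.5, 4.7]]                                      # mishap

axis = first_principal_axis(covariance_matrix(N))  # baseline axis from N only
n_scores = project(N, axis)
m_scores = project(M, axis)
u = standardize_against_baseline(n_scores, n_scores)  # non-mishap, as in (6)
v = standardize_against_baseline(n_scores, m_scores)  # mishap, as in (6)
```

By construction `u` has mean zero and unit standard deviation, so any shift or spread in `v` relative to `u` is what the superimposed density traces would be expected to show.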

Figure 11 shows the resulting standardized first principal 
component plot of the transformed independent predictor 
variables. Although there is some difference shown, there is 
no discernable difference that could be used to discriminate 
between a mishap and a non-mishap squadron. Therefore the 
method of principal components indicates that there may not 
exist a linear additive model of the independent predictor 
variables that could be used to classify a mishap or 
non-mishap squadron. 

Figure 11. Density trace of standardized first 
principal components of the transformed data. 


B. LOGISTIC REGRESSION 


To continue the effort to develop a predictive statistical 
model, the method of logistic regression was pursued. Logistic 
regression uses a linear logistic transformation function 
that models the logarithm of the odds of an event 
occurring, where the odds are the ratio of the probability of 
success to the probability of failure. From this the model 
gives the likelihood that an event 
will occur given a particular set of predictor variables. The 
logit model takes on the form [Ref. 9] 


$$ P_i = \frac{1}{1 + e^{-(\alpha + \beta X_i)}} $$

or 

$$ \log \frac{P_i}{1 - P_i} = \alpha + \beta X_i \qquad (7) $$

where $P_i$ = probability of an event occurring 
$X_i$ = attributes of an event 
$\beta$ = coefficient vector 
$\alpha$ = scalar. 

Although the individual probabilities of an event occurring, 
$P_i$, are not known, the information available for each 
observation is whether an event occurred or did not occur. The 
measured dependent variable is $Y_i = 1$ if an event occurred, 
and $Y_i = 0$ if no event occurred. This dependent variable is 
used with maximum likelihood estimation for the logit model to 
estimate $\alpha$ and $\beta$ [Ref. 10]. Results from the 
predictive statistical model provide an estimated forecast of 
the probability of an event occurring, based upon 
a particular set of attributes. Using a selected critical 
probability, any set of attributes can be classified as an 
event or non-event observation based upon the log odds 
calculated by the predictive model. The critical probability 
should be selected so that type I errors are minimized while 
maintaining an accurate predictive model. 
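
The estimation and classification steps can be sketched as follows. This is a minimal illustration, not the software or procedure used in the thesis: the maximum likelihood estimates are found here by plain gradient ascent on the log-likelihood, which is an assumption of the sketch, and the function names, step size, iteration count, and data are all made up.

```python
import math


def sigmoid(z):
    """Logistic function 1 / (1 + exp(-z)), written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logit(X, y, learning_rate=0.1, iterations=5000):
    """Estimate alpha (scalar) and beta (list) by gradient ascent
    on the logit log-likelihood."""
    n, p = len(X), len(X[0])
    alpha, beta = 0.0, [0.0] * p
    for _ in range(iterations):
        grad_a, grad_b = 0.0, [0.0] * p
        for xi, yi in zip(X, y):
            err = yi - sigmoid(alpha + sum(b * x for b, x in zip(beta, xi)))
            grad_a += err
            for j in range(p):
                grad_b[j] += err * xi[j]
        alpha += learning_rate * grad_a / n
        beta = [b + learning_rate * g / n for b, g in zip(beta, grad_b)]
    return alpha, beta


def predict_probability(alpha, beta, xi):
    """Estimated P_i for one set of attributes xi."""
    return sigmoid(alpha + sum(b * x for b, x in zip(beta, xi)))


def classify(prob, critical_probability):
    """1 (event) if the estimated probability reaches the critical value."""
    return 1 if prob >= critical_probability else 0


# Made-up, overlapping data with a single attribute (illustration only).
X = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]]
y = [0, 0, 1, 0, 1, 1]

alpha, beta = fit_logit(X, y)
p_low = predict_probability(alpha, beta, [0.0])
p_high = predict_probability(alpha, beta, [5.0])
flag = classify(p_high, critical_probability=0.5)
```

Raising `critical_probability` flags fewer observations as events and lowering it flags more, which is the trade-off the choice of critical probability has to balance.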

A logistic regression of the aircraft mishap data was 
performed in an attempt to produce a predictive statistical 
model to forecast aircraft mishaps. Figure 12 shows the 
superimposed plot of the log odds of the mishap and non-mishap 
observations. In this plot the forecasted log odds is the 
odds of each observation being classified as a non-mishap 
observation. For example, given a set of predictor variables 
from a particular squadron, the plot shows the log odds of 
that squadron being classified as a non-mishap squadron. As 
can be seen, the log odds of classifying the observations as 
a non-mishap squadron fall between 0.73 and 0.99 for both 
mishap and non-mishap observations. This indicates that the 
predictive model has a high probability of classifying every 
observation as a non-mishap. There is no critical probability 
that would partition the decision space so as to yield an 
acceptable predictive statistical model while minimizing 
type I errors. 

Figure 12. Plot of the log odds of non-mishap and 
mishap observations produced by logistic regression. 
[Axis: log odds of no mishap occurring.] 


This predictive statistical model is obviously not useful, 
since to forecast a high percentage of mishaps, almost all of 
the squadrons would have to be told that they are at a high 
risk of having a mishap. If all the squadrons are 
told that they are at risk, then the predictive statistical 
model will soon be disregarded. 
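
The trade-off described above can be made concrete with a small sketch that asks what share of all observations must be flagged, highest predicted risk first, in order to catch a given share of the actual mishaps. The function name `fraction_flagged_for_recall` and the probabilities and labels are hypothetical and are not taken from the thesis data.

```python
import math


def fraction_flagged_for_recall(mishap_probs, labels, target_recall):
    """Smallest share of observations that must be flagged (highest
    predicted mishap probability first) to include at least
    `target_recall` of the actual mishaps."""
    total_mishaps = sum(labels)
    if total_mishaps == 0:
        raise ValueError("no mishap observations")
    needed = math.ceil(target_recall * total_mishaps)
    ranked = sorted(zip(mishap_probs, labels), key=lambda t: t[0], reverse=True)
    caught = 0
    for flagged, (_, label) in enumerate(ranked, start=1):
        caught += label
        if caught >= needed:
            return flagged / len(ranked)
    return 1.0


# Made-up predicted mishap probabilities; the two mishap months (label 1)
# are scattered among the non-mishap months rather than ranked at the top.
probs = [0.20, 0.18, 0.15, 0.12, 0.10, 0.08, 0.05, 0.03]
labels = [0, 1, 0, 0, 0, 0, 1, 0]

share_for_all = fraction_flagged_for_recall(probs, labels, 1.0)
share_for_half = fraction_flagged_for_recall(probs, labels, 0.5)
```

When the predicted probabilities of mishap and non-mishap observations overlap as heavily as in this made-up example, catching every mishap requires flagging nearly every observation.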


C. DATA MANIPULATION 


Since all of the preceding detailed analysis failed to 
provide an acceptable predictive statistical model to forecast 
mishaps, an attempt to define a model was made by using 
different subsets of the original data. As stated in the data 
chapter, all mishaps were included in the original analysis, 
except for birdstrike mishaps. 

Since all the variables were maintenance related, the 
first transformation eliminated all pilot error mishap 
observations, so that only mishaps that involved material 
failure or maintenance personnel error were analyzed. All 
other observations were considered as non-mishap observations. 

The second transformation took the above transformation 
and further eliminated all Class B and Class C mishap 
observations. This transformation resulted in a data set of 
maintenance related Class A mishaps. All other observations 
were considered as non-mishap observations. 

Neither of the above transformations led to any 
difference in the outcome of the analysis. 
V. SUMMARY, CONCLUSIONS, AND RECOMMENDATIONS 


A. SUMMARY 


This thesis has examined the relationship between existing 
monthly maintenance reports and aircraft mishaps. The 
reported monthly maintenance and personnel variables were 
analyzed to determine if any combination of the variables 
could be used to describe a predictive statistical model that 
can classify a squadron as a mishap or non-mishap squadron in 
the upcoming month. 

Based upon a graphical analysis, there were no obvious one- 
or two-dimensional relationships that could be used to 
classify a mishap squadron. The further techniques of 
principal components and logistic regression did not produce 
any higher order relationships that could be used to classify 
a mishap squadron. 


B. CONCLUSIONS 


Based on the particular data analyzed, there apparently is 
no relationship between existing monthly maintenance reports 
and aircraft mishaps. This result might indicate that with 
this particular data there is no existing relationship, or it 
might indicate that a monthly generated report may not be 
helpful in predicting an aircraft mishap. The fact that the 
data are reported at the end of the month could possibly 
conceal any subtle useful changes or indications that could be 
exploited to forecast aircraft mishaps. 


C. RECOMMENDATIONS 


This thesis indicates that there is no relationship 
between existing monthly maintenance reports and aircraft 
mishaps that could be used in developing a predictive 
statistical model to classify a squadron as a mishap or 
non-mishap squadron. 

Two alternative recommendations are evident. The first 
alternative is to accept that there may be no exploitable 
relationship between monthly maintenance reports and aircraft 
mishaps, and to focus elsewhere to find a predictive 
statistical model that forecasts aircraft mishaps. The second 
alternative is that further analysis be done, 
possibly using daily maintenance reports instead of 
monthly maintenance reports, to develop a predictive 
statistical model that forecasts aircraft mishaps. 